Comparative analysis of Diospyros (Ebenaceae) plastomes: Insights into genomic features, mutational hotspots, and adaptive evolution

Abstract Diospyros (Ebenaceae) is a widely distributed genus of trees and shrubs from pantropical to temperate regions, with numerous species valued for their fruits (persimmons), timber, and medicinal uses. However, information regarding their plastomes and chloroplast evolution is scarce. The present study performed comparative genomic and evolutionary analyses on plastomes of 45 accepted Diospyros species, including three newly sequenced ones. Our study showed a highly conserved genomic structure across the Diospyros species, with 135–136 genes, including 89 protein‐coding genes, 1–2 pseudogenes (Ψycf1 for all, Ψrps19 for a few), 37 tRNA genes and 8 rRNA genes. Comparative analysis of Diospyros identified three intergenic regions (ccsA‐ndhD, rps16‐psbK and petA‐psbJ) and five genes (rpl33, rpl22, petL, psaC and rps15) as the mutational hotspots in these species. Phylogenomic analysis identified the phylogenetic position of the three newly sequenced species and strongly supported monophyletic (sub)temperate taxa and four clades in the pantropical taxa. Codon usage analysis identified 30 codons with relative synonymous codon usage (RSCU) values >1 and 29 codons ending with A and U bases. Three codons (UUA, GCU, and AGA) with the highest RSCU values were identified as the optimal codons. The effective number of codons (ENC)‐plot indicated a significant role of mutational pressure in shaping codon usage, while most protein‐coding genes in Diospyros experienced relaxed purifying selection (dN/dS < 1). Additionally, the psbH gene showed positive selection (dN/dS > 1) in the (sub)temperate species. Thus, the results provide a meaningful foundation for further elaborating the genetic architecture and taxonomy of Diospyros, enriching genetic diversity and conserving genetic resources.


| INTRODUCTION
Diospyros (Ebenaceae) is a genus well-known for hardwood, delicious fruits and medicines (Lee et al., 1996; Lin et al., 2020; Luo et al., 2021; Wallnöfer, 2001; White, 1956). Diospyros is the largest genus of the Ebenaceae family, with about 500 evergreen or deciduous shrub and tree species distributed in tropical and temperate regions (Lee et al., 1996; The Plant List, 2002). The genus is characterized by male cymose inflorescence, solitary female flowers, fleshy berries with enlarged persistent calyx at the base, and a dioecious breeding system (Lee et al., 1996). However, the morphological similarities make it difficult to distinguish the species, hindering research and economic development.
Previous studies found that Diospyros belongs to the Ebenoideae subfamily (Ebenaceae) and is closely associated with Euclea Murray and Royena L. (Duangjai et al., 2006, 2009; Fu et al., 2016; Li et al., 2018; Linan et al., 2019; Samuel et al., 2019). Within the genus, about 11 (or 12) clades were supported by molecular phylogenetic studies based on multilocus or genomic data (Duangjai et al., 2006, 2009; Linan et al., 2019). A previous study has established that the island Diospyros species have been shaped by ancestral bottlenecks, rapid and recent radiations in phenotypic characters, and repeated and convergent evolution of potentially adaptive traits during the diversification (Fernández-Mazuecos et al., 2020). The island Diospyros taxa (New Caledonia) also experienced similar evolutionary pressure (Turner et al., 2016). Macroevolutionary studies of Diospyros migration among the continents indicated a complex evolutionary history of species diversity (Duangjai et al., 2009; Linan et al., 2019).
However, little attention has been paid to the adaptive evolution of Diospyros species (clades) that shifted along latitude into different climatic zones. Unlike the pantropical clades of Diospyros, previous studies showed that the (sub)temperate clade is distributed in higher latitude zones from subtropical to temperate, including D. virginiana L., D. kaki Thunb., D. lotus L., and so forth (Duangjai et al., 2009; Linan et al., 2019; Yonemori et al., 2008). To adapt to the environmental conditions of high latitude or high elevation, Diospyros taxa of the (sub)temperate clade are usually deciduous with broad thin-leathery leaves (Lee et al., 1996; Tang et al., 2014). There are already examples in which selective pressures associated with habitats seem to have caused the rapid evolution of genes involved in cold response in Cardamine (Ometto et al., 2012), in high-altitude response in Dysosma (Ye et al., 2018) or in sunlight preferences in Oryza. Therefore, on the basis of previous molecular phylogenetic research, it is of great significance to study the adaptive evolution of Diospyros along latitude in different climatic zones by using new molecular markers such as plastomes.
The structurally stable and maternally inherited plastomes with low recombination levels play a pivotal role in phylogenetic and evolutionary studies (Jansen et al., 2007; Wicke et al., 2011; Xia, Liao, et al., 2022; Xia, Liu, et al., 2022). The genes in plastomes primarily encode proteins related to photosynthesis and other biochemical pathways, including starch storage, nitrogen and sulfate metabolism, and chlorophyll, carotenoid, or fatty acid synthesis (Mohanta et al., 2020; Wicke et al., 2011). Moreover, plastomes are conserved in terms of genomic structures and substitution rates among most angiosperms, which makes plastomes a widely used molecular marker. Additionally, several studies have detected positive selection signals in plastid genes during evolution. For example, accelerated evolutionary rates of matK (Maturase K) in the low-altitude and recently derived lineages of Dysosma are related to the adaptation of the genus to high-altitude environments (Ye et al., 2018). Furthermore, analysis of the dN/dS ratios of Cardamineae suggested positive selection on the ycf2 (hypothetical chloroplast RF21) gene in watercress, possibly allowing the species to adapt to specific living environments (Yan et al., 2019).
Most plastid genes are under selection pressure due to their significant roles in maintaining essential cellular functions and, therefore, often retain the adaptive characteristics during evolution (Wicke et al., 2011). Additionally, the codon usage bias in plastomes serves as a suitable strategy for identifying the principal evolutionary driving forces (Gao et al., 2022; Jiang et al., 2014; Kapralov & Filatov, 2007). For example, the effective number of codons (ENC)-plot showing deviations from the expected curve for a few genes suggested that, apart from natural selection, mutational pressure also played a major role in shaping codon usage in Helianthus annuus (Gao et al., 2022). These findings have demonstrated that the genetic diversity in plastomes provides useful information about plants' adaptive evolution.
The present study aimed to investigate the adaptive evolution of Diospyros using plastomes. We assembled plastomes of 45 accepted  (3) perform the phylogenetic analysis for Diospyros species identification using the complete plastomes and (4) analyze the codon usage bias of plastid genes and scan for candidate genes that could evolve under Darwinian selection in different climatic zones.

| DNA extraction
The plastomes of three Diospyros species (Figure 1)

| Genome sequencing, assembly, and annotation
Approximately 1 μg of the extracted DNA with a concentration higher than 12.5 ng/μL was used for plastome sequencing at the Beijing Genomics Institute (BGI). Before sequencing, total DNA was sheared into fragments shorter than 800 bp. The DNA fragments' quality was evaluated using Agilent Bioanalyzer 2100 (Agilent Technologies), and the pooled library was sequenced on an Illumina HiSeq X10 platform to obtain 150 bp long raw reads.
The raw reads were filtered by removing the sequences with a Phred score lower than 30, and the remaining ones were used for genome assembly using GetOrganelle toolkit (Jin et al., 2020).
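The read-filtering step can be illustrated with a minimal Python sketch. The text does not state which tool was used or whether the threshold of 30 applied to the mean read quality or to individual bases, so the sketch assumes a mean-quality filter and Phred+33 encoding; the function names `parse_fastq`, `mean_phred` and `filter_reads` are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

FastqRecord = Tuple[str, str, str]  # (header, sequence, quality string)


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of one read, assuming Phred+33 ASCII encoding."""
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield (header, sequence, quality) from plain 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = next(it).strip()
        next(it)  # skip the '+' separator line
        qual = next(it).strip()
        yield header.strip(), seq, qual


def filter_reads(records: Iterable[FastqRecord],
                 min_mean_q: float = 30.0) -> List[FastqRecord]:
    """Keep reads whose mean Phred score is at least min_mean_q (assumed rule)."""
    return [rec for rec in records if mean_phred(rec[2]) >= min_mean_q]


if __name__ == "__main__":
    demo = ["@r1", "ACGT", "+", "IIII", "@r2", "ACGT", "+", "5555"]
    for header, seq, _ in filter_reads(parse_fastq(demo)):
        print(header, seq)
```

A per-base or sliding-window rule would give different results; the mean-quality rule is used here only because it is the simplest reading of the sentence.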
The command line used for the assembly was as follows: get_organelle_reads.py -1 forward.fq -2 reverse.fq -o plastome_output -R 15 -k 21,45,65,85,105 -F plant_cp. The newly sequenced plastomes of Diospyros species were annotated with Geneious Prime 2021 (Biomatters), using the plastome sequence of D. virginiana L. (GenBank accession no. MF288577) as the reference. The CPGAVAS2 web server (http://www.herbalgenomics.org/cpgavas) was used to predict the types and structures of all the protein-coding and noncoding genes in the plastome. The location of the start and stop codons, exon-intron boundaries, and the tRNA gene length and types were confirmed by comparing the annotation results from CPGAVAS2 and Geneious Prime 2021. Finally, the plastome maps for the newly sequenced species were drawn using the online tool OrganellarGenomeDRAW (Lohse et al., 2007). Plastomes of 42 other Diospyros species and two outgroups (Manilkara zapota: MN295595 and Camellia japonica: MG543990; Table 2, Figure 7) were downloaded from the NCBI GenBank repository and re-annotated using the method described above. According to their climatic zones, the Diospyros species were divided into (sub)temperate taxa (eight species) and pantropical taxa (37 species; Table 2).

| Plastome comparison
The GenBank accession numbers of the plastomes of the 45 Diospyros species used for comparative analyses are shown in Table 2. The plastome sequences of these 45 Diospyros species were aligned using the LAGAN model implemented in the mVISTA software to evaluate the degree of variation (Frazer et al., 2004), using default parameters and D. eriantha as the reference. The rearrangement in the sequences was detected using the whole genome alignment tool Mauve implemented in Geneious Prime 2021 (Darling et al., 2004).

| Detection of repeated sequences
Repeated sequences are essential components of the gene regulatory network; they are identical or complementary nucleotide fragments distributed throughout the genome. Two large families of repeated sequences, the dispersed repeated sequences (DRS, including forward, reverse, complement, and palindromic sequences) and the tandem repeated sequences (TRS, known as satellite DNA), can be readily recognized based on their distribution pattern in the genome (Sperling & Li, 2013). Satellite DNA refers to repetitions of short DNA sequences and is of three types: macrosatellites, minisatellites, and microsatellites (simple sequence repeats or SSRs; Hoy, 2013). The DRS in the plastomes of 45 Diospyros species were predicted with REPuter (Kurtz et al., 2001), and the forward, reverse, palindromic, and complementary repeat sequences were identified using the following parameters: length of repeat unit ≥30 bp, sequence consistency ≥90% (Hamming distance = 3). Meanwhile, the Tandem Repeats Finder (TRF) web server (https://tandem.bu.edu/trf/trf.html) was used to search for TRS in the plastomes using default settings (Benson, 1999), and the MISA software was used to identify SSRs (Beier et al., 2017), with the minimum length of SSR fragment set to 10 bp and the minimum repetition threshold values for mono-, di-, tri-, tetra-, penta-, and hexanucleotide set to 10, 5, 4, 3, 3, and 3, respectively. Finally, all the detected repeat sequences were manually checked and corrected to remove the redundant ones.
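The SSR thresholds stated above can be made concrete with a short Python sketch. It is not MISA itself, only a regular-expression illustration of what "minimum repetition of 10, 5, 4, 3, 3, and 3" means for perfect repeats; the function name `find_ssrs` is hypothetical, and compound or interrupted SSRs are not handled.

```python
import re
from typing import List, Tuple

# Minimum repeat counts per motif length (mono- to hexanucleotide), as in the text.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq: str) -> List[Tuple[int, str, int]]:
    """Return (0-based start, motif, repeat count) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # since those runs belong to the shorter motif class.
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GGC" + "A" * 10 + "CCG" + "AT" * 5 + "GCG"
    print(find_ssrs(demo))
```

With these thresholds every reported SSR is at least 10 bp long (10 x 1, 5 x 2, 4 x 3, 3 x 4 and so on), which matches the stated minimum fragment length.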

| Analysis of codon usage
Codon usage bias refers to the unequal usage of synonymous codons in genetic material (Guo et al., 2017; Hershberg & Petrov, 2008; Plotkin & Kudla, 2011). For codon usage analysis, protein-coding sequences longer than 300 bp with ATG as the start codon were isolated from each plastome. CodonW (http://codonw.sourceforge.net) was used to analyze the number and types of codons encoding the proteins and to calculate the effective number of codons (ENC), the relative synonymous codon usage (RSCU) and the GC3 (guanine and cytosine content at the third codon position) values. Further, the effect of base composition on codon usage bias was evaluated by ENC plotting, with ENC and GC3 values along the y-axis and x-axis. The observed ENC value was compared with the expected ENC value using the following equation (Wright, 1990): ENCexp = 2 + GC3 + 29/[GC3² + (1 − GC3)²]. The effects of gene mutation and natural selection on codon usage bias were evaluated by PR2 plotting with [A3/(A3 + T3)] and [G3/(G3 + C3)] along the y-axis and x-axis: this plot reflects the potential biased usage of A/T and G/C in the third codon position.
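As a rough illustration of the quantities named in this section, the following Python sketch computes RSCU, GC3 and the PR2 coordinates for a single coding sequence. It is not CodonW: GC3 and PR2 are computed here over all codons of the sequence for simplicity (CodonW and many PR2 analyses restrict these to synonymous or fourfold-degenerate sites), and all function names are hypothetical.

```python
from collections import Counter
from itertools import product
from typing import Dict, Tuple

# Standard genetic code in TCAG order; plastid translation table 11 uses the
# same codon-to-amino-acid assignments.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_counts(cds: str) -> Counter:
    """Count in-frame codons of a coding sequence (DNA or RNA alphabet)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    return Counter(c for c in codons if c in CODON_TABLE)


def rscu(counts: Counter) -> Dict[str, float]:
    """RSCU = observed count / mean count of the codons for the same amino acid."""
    values = {}
    for aa in set(CODON_TABLE.values()) - {"*"}:
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


def gc3(counts: Counter) -> float:
    """Fraction of codons with G or C at the third position (all codons)."""
    total = sum(counts.values())
    gc = sum(n for codon, n in counts.items() if codon[2] in "GC")
    return gc / total if total else 0.0


def pr2(counts: Counter) -> Tuple[float, float]:
    """PR2 coordinates: x = G3/(G3 + C3), y = A3/(A3 + T3)."""
    third = Counter()
    for codon, n in counts.items():
        third[codon[2]] += n
    gc_total = third["G"] + third["C"]
    at_total = third["A"] + third["T"]
    # 0.5 (no bias) is returned when a ratio is undefined; this is a sketch choice.
    x = third["G"] / gc_total if gc_total else 0.5
    y = third["A"] / at_total if at_total else 0.5
    return x, y
```

An RSCU above 1 means a codon is used more often than expected under equal usage of its synonymous codons; single-codon amino acids (Met, Trp) always give exactly 1.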

| Analysis of genetic diversity and selective pressure
The plastomes were aligned using the MUSCLE alignment software implemented in Geneious to screen for the highly divergent regions among the 45 Diospyros species (Edgar, 2004). The protein-coding genes, noncoding genes, and the intergenic regions were extracted from the plastomes to analyze the nucleotide diversity (Pi) among the Diospyros species using DnaSP (v5.0; Librado & Rozas, 2009) based on the total number of mutations and the average nucleotide variation. Then, to evaluate the effect of environmental pressure on the evolution of Diospyros species, the nonsynonymous (amino acid-altering) to synonymous (silent) substitution rate ratio (ω = dN/dS) of all the annotated protein-coding gene sequences in the plastomes across the phylogeny was calculated, with ω = 1, <1 (especially <0.5), and >1, indicating neutral evolution, purifying selection, and positive selection, respectively (Kimura, 1983; Yang & Nielsen, 2002). The branch-site model (Yang & Nielsen, 2002) was used to identify positively selected loci from genes in the foreground branch using sequences of M. zapota (Sapotaceae).
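For readers unfamiliar with Pi, the following Python sketch computes nucleotide diversity as the average number of pairwise differences per site for one aligned region. It is a simplified stand-in for the DnaSP calculation: columns containing gaps or ambiguous bases are dropped entirely (an assumption about gap handling), and the function name is hypothetical.

```python
from itertools import combinations
from typing import Sequence


def nucleotide_diversity(aligned: Sequence[str]) -> float:
    """Average pairwise differences per site over gap-free alignment columns."""
    if len(aligned) < 2:
        raise ValueError("need at least two aligned sequences")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must have equal (aligned) length")
    seqs = [s.upper() for s in aligned]
    # Keep only columns in which every sequence carries an unambiguous base.
    sites = [i for i in range(length) if all(s[i] in "ACGT" for s in seqs)]
    if not sites:
        return 0.0
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a[i] != b[i] for i in sites) for a, b in pairs)
    return diffs / (len(pairs) * len(sites))


if __name__ == "__main__":
    # Toy alignment, not data from this study.
    print(nucleotide_diversity(["ACGT", "ACGA", "ACGT"]))
```

Applied region by region (each gene or intergenic spacer as a separate alignment), values of this kind are what allow regions to be ranked as more or less variable.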
Furthermore, to examine the selective pressure on plastid genes with different functions, the CDS genes were classified into photosynthesis-related, self-replication-related, and other functional genes (Table S1). To examine whether the dN/dS values of CDS genes differed significantly among functional classes or between taxa, one-way analysis of variance (ANOVA) or the Mann-Whitney U test was performed, chosen on the basis of the Shapiro-Wilk normality test and the Levene test, with least significant differences at p = .05. Finally, boxplot graphs of the dN/dS values of CDS genes were generated according to the functional classes or taxa, and the significance of the differences between groups was labeled. All analyses were conducted in R version 4.3.0 (https://www.R-project.org/).
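The analyses in this study were run in R; purely as an illustration of the nonparametric branch of that procedure, the following Python sketch computes the Mann-Whitney U statistic for two groups of dN/dS values. It uses the normal approximation with a tie correction and no continuity correction, which is only rough for small groups; the function names and the example values are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def midranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0  # all values identical: no evidence of a difference
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return u, math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Invented dN/dS values for two gene classes, for illustration only.
    photosynthesis = [0.05, 0.08, 0.10, 0.12]
    self_replication = [0.20, 0.25, 0.31, 0.40]
    print(mann_whitney_u(photosynthesis, self_replication))
```

In practice the choice between ANOVA and this test would follow from the normality and variance-homogeneity checks described above, which are not reproduced here.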

| Phylogenomic inference
Phylogenetic analyses were conducted using the complete chloroplast genome sequences of 45 Diospyros species, excluding one copy of the inverted repeat, with M. zapota (Sapotaceae, a sister to Ebenaceae) and Camellia japonica (Theaceae) as outgroups

TABLE 1 Geographic information and specimen voucher number of the Diospyros species sequenced in this study.

TABLE 2 The features of plastomes of 45 Diospyros species and 2 outgroup species. Column headers include: Climatic zone; Total (bp).

| Genome structure and nucleotide variation
The three newly generated Diospyros plastome sequences have been deposited in GenBank (OP480008, OP480009, OP485441; Table 1). Similar to most angiosperms, these three Diospyros species have plastomes with a typical quadripartite structure, with two inverted repeats (IR) separated by a large single copy (LSC) region and a small single copy (SSC) region (Figure 2 (Table 2). A total of 135-136 genes, including 89 protein-coding genes, 1-2 pseudogenes, 37 tRNA genes, and 8 rRNA genes were identified in these species, among which 10 protein-coding genes, 7 tRNA genes, and 4 rRNA genes were repeated in the two IRs (Table S1). The ycf1 in the IRb of all Diospyros species (a short Ψycf1) and the rps19 in the IRa region in most Diospyros species (a short Ψrps19) were identified as pseudogenes (Table S1). Six tRNAs and nine kinds of protein-coding genes had one intron, while the clpP, ycf3, and rps12 genes had two (Table S1). The matK gene was found embedded in the intronic region of trnK-UUU, consistent with various other plant taxa. Meanwhile, the trans-spliced rps12 gene, with the 5′ and 3′ ends located in the LSC and IR, had two independent transcription units.
Multiple plastome comparisons among the Diospyros species using mVISTA and Mauve alignment showed a high degree of collinearity. The gene organization and distribution patterns in the plastome were highly consistent among the Diospyros species (Figure S1).

inversion or translocation, was detected among Diospyros plastome sequences (Figure S1). However, slight differences were observed in different regions throughout the plastome sequence.
The sequence similarity among Diospyros plastome sequences was much higher in the two IRs, especially the rRNA coding regions. By contrast, the nucleotide mutation rate was high in the noncoding regions, especially the intergenic spacer (IGS) regions (Figures S1 and S2).
Contraction and expansion of IR indicate plastome evolution and are correlated with plastome size. The present study found con-

| Repetitive sequences in plastomes
REPuter identified 2988 repeated sequences, including 16-31 forward repeats, 18-35 palindromic repeats, 0-3 reverse repeats, and 18-34 tandem repeats, in the 45 Diospyros species (Tables S3 and S4, Figure 4). Among the species, D. olen had the maximum (99) forward, palindromic, reverse, and tandem repeats. Tandem repeats were more prevalent and accounted for 37.08% of all the repeat types. In contrast, reverse repeats were relatively rare and accounted for only 0.23% of the repeat types (Table S4). The length of  and were located in the IGS of trnG-UCC and trnR-UCU (Table S3).
In addition, 19.91% of the SSRs were found in the CDS, while the other 80.09% were found in the introns and IGS (Tables S3 and S4, Figure 5).
Further analysis revealed high variability in the gene spacer, with a Pi value significantly higher than that of the gene-coding region (CDS; Figure 6). These findings suggest that hypervariable DNA fragments between the different Diospyros species could be used as ptDNA barcodes for taxonomic classification, species discrimination, and phylogenetic reconstruction and inference.

| Phylogenetic inference
Using Sapotaceae and Theaceae as the outgroups, ML and BI trees

| Selective pressure in CDS genes
However, we also found that the self-replication gene rpl23 and the photosynthesis gene psaI (dN/dS > 1) were under strong positive selection in both pantropical and (sub)temperate taxa. Remarkably, the dN/dS value of the photosynthesis gene psbH was higher than one only in the (sub)temperate taxa (Figure 8c, Table S6a). For species in the pantropical and (sub)temperate taxa, the dN/dS values of photosynthesis-related genes were significantly lower than those of self-replication-related and the other genes, suggesting stronger purifying selection (Figure 8a, Table S6b). However, there were no
photosynthesis-related and the other genes between pantropical and (sub)temperate taxa, respectively (Figure 8b, Table S6c).

FIGURE 7
were recognized as the second and third optimal codons, respectively (Table S7). In contrast, cysteine (Cys) was the least used amino acid (1.06%), and the serine (Ser)-encoding codon AGC had the minimum RSCU value of 0.32 (Table S7). In addition, AUG and UGG, encoding methionine (Met) and tryptophan (Trp), had an RSCU value of 1, indicating no bias in the codon usage for these two amino acids (Table S7). Moreover, 30 codons had an RSCU >1, of which 16 had U in the third position, 13 had A, and one had G, which indicates that codons ending with U or A are preferred in the Diospyros plastomes (Table S7).
Further, the ENC-GC3 plot was obtained by taking the ENC value of each gene as the ordinate and the GC3 value as the abscissa to explore which pressure (mutation pressure or natural selection) shaped codon usage (Figure 9). The ENC value ranged from 32.36 to 60.62 and the GC3 value from 0.143 to 0.346 (except ycf68: 0.466-0.550; Table S8). Figure 9a shows that most genes are close to the standard curve, and a few are far below it, indicating the influence of mutation pressure and natural selection on the codon usage bias of Diospyros genes (Figure 9). Then, to accurately evaluate the difference between the observed value (ENCobs) and the expected value (ENCexp) of ENC, the (ENCexp − ENCobs)/ENCexp ratio was calculated (Table S6). For most genes, this ratio ranged from −0.1 to 0.1, indicating only a slight difference between the ENCexp and ENCobs values.
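The expected curve and the deviation ratio used here can be written out as a short Python sketch. It assumes the standard form of the expected ENC under GC3-only constraint attributed to Wright (1990), ENCexp = 2 + s + 29/(s² + (1 − s)²) with s = GC3; the function names and the example values are hypothetical.

```python
def expected_enc(gc3: float) -> float:
    """Expected ENC when codon usage is shaped only by GC3 (s = GC3)."""
    if not 0.0 <= gc3 <= 1.0:
        raise ValueError("GC3 must be a fraction between 0 and 1")
    return 2.0 + gc3 + 29.0 / (gc3 ** 2 + (1.0 - gc3) ** 2)


def enc_ratio(enc_obs: float, gc3: float) -> float:
    """(ENCexp - ENCobs) / ENCexp; values near 0 mean the gene lies on the curve."""
    enc_exp = expected_enc(gc3)
    return (enc_exp - enc_obs) / enc_exp


if __name__ == "__main__":
    # Illustrative values only, not taken from Table S8.
    for gene, enc_obs, gc3 in [("geneA", 48.0, 0.25), ("geneB", 35.0, 0.30)]:
        print(gene, round(expected_enc(gc3), 2), round(enc_ratio(enc_obs, gc3), 3))
```

A gene whose observed ENC falls well below the curve gives a clearly positive ratio, which is the pattern usually read as evidence that something other than base composition contributes to its codon bias.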
The differences in codon usage bias among Diospyros genes were related to the differences in GC3, indicating a significant influence of mutation pressure on codon usage bias. Detailed analysis showed considerable deviation of the observed ENC values from the standard curve for five genes (rps18, rps14, psbA, rpl16 and ycf3) in all the species (Figure 9a,b). Among all the genes, ycf3 showed the highest ENC value, while rps18 and rpl16 had the lowest (Figure 9b; Table S8). The PR2 plot showed slight disequilibrium in A/T and G/C usage in the third codon position of CDSs of the 45 Diospyros plastomes, especially in five CDSs (accD, psbA, psbD, rpl16 and rps14; Figure 9c). More genes were distributed in Quadrant IV (bottom right of Figure 9c) than in the other three quadrants, indicating frequent use of G and T in the third codon position, particularly in gene accD. This observation suggests that the existing codon usage pattern may be due to the combined action of natural selection and mutation. Genes from pantropical and (sub)temperate species were presented using different colors in the ENC and PR2 (parity rule 2) plots (Figure 9). There were no obvious differences in the main driving force of codon usage bias in Diospyros species between these two groups (Figure 9; Table S8).
Diospyros species based on eight plastid regions. Eleven clades were recognized, but with relatively weak support for some clades (Duangjai et al., 2009). Recently, researchers have discussed using plastomes as super-barcodes for phylogenetic studies (Li et al., 2021; Wang et al., 2017). The phylogenetic analysis of this study showed that plastomes are helpful as a super-barcode for resolving the phylogenetic relationships of Diospyros species (Figure 7). Two super-clades, which contained clades III-IV and clades VII-XI-IX, were strongly supported (BS = 100% and PP = 1), whereas their support was 79% and 77% in the most-parsimonious trees, respectively (Duangjai et al., 2009). The present phylogenetic analyses revealed that clades III, VII, XI, and IX were all monophyletic, with clade VII forming the basal clade of clades XI and IX, strongly supporting the previous phylogenetic analysis (Duangjai et al., 2009).

| Phylogenetic relationship of Diospyros species
The present study found that the topology of (sub)temperate Diospyros species was consistent with an earlier plastome-based study. However, we carried out the phylogenetic analysis using more samples and obtained reliable results with greater precision. In the plastome-based tree, D. zhejiangensis  (Lee et al., 1996), which is similar to Tang et al. (2014). In addition to the similar phylogenetic relationships among the three species, D. morrisiana has relatively smaller leaves and fruits than D. glaucifolia and D. lotus (Lee et al., 1996). Meanwhile, D. virginiana was identified as the basal taxon of the (sub)temperate clade. The fruits of D. virginiana are an important food for wildlife, native people, and Euro-American colonists. These fruits have never been commercialized, despite the selection of superior clones over the years (Boufford, 2022). Therefore, D. virginiana, as the basal lineage of the deciduous group that still exists in the wild, can be used as a species for cultivation and breeding.
In the pantropical taxa, D. eriantha and D. strigosa are clustered together based on the new plastome sequences, consistent with their similarities in morphological characteristics. However, the newly sequenced D. strigosa did not form a cluster with D. strigosa MF179495. We think that the previously sequenced plastome of D. strigosa MF179495 (collected from Yunnan, China; Yu et al., 2017) was most likely misidentified, considering that D. strigosa is found only in Hainan, China (Lee et al., 1996). In the clade VII and XI, we recognized phyloge-  (Duangjai et al., 2009; Turner et al., 2016). Elucidating the boundaries between the different Diospyros species would improve our understanding of the origin and phylogeny of the cultivated species and help decide the breeding strategy. However, except for the (sub)temperate taxa, the pantropical taxa are inadequately sampled in this study, and the phylogenetic positions of some species are unresolved with weak support (see Clade III&XI; Figure 7). In the future, the plastome-based or nuclear genome-based phylogenomic tree needs to be further studied based on more extensive sampling in Diospyros.

| Adaptive evolution of Diospyros plastomes
Furthermore, we found that the dN/dS values of 78 common genes among the 45 Diospyros species were less than one. We also found that the dN/dS values of photosynthesis-related genes were significantly lower than those of self-replication-related and other genes in the (sub)temperate and pantropical taxa (Figure 8). This observation indicated that most important photosynthesis-related genes are undergoing strong purifying selection. Purifying selection usually reduces genetic diversity and maintains gene homozygosity via the selective removal of deleterious alleles (Cvijović et al., 2018). Moreover, the functional importance of a protein determines its evolutionary rate
Typically, the usage pattern of the third base of the codon is closely related to codon usage bias (Gao et al., 2022). The GC composition is closely related to codon and amino acid usage, and the GC content of the third base of a codon (GC3) reflects codon usage patterns (Chen et al., 2013). Studies have shown that dicots and monocots prefer codons ending with A/U and C/G, respectively (Liu et al., 2020; Yao et al., 2008
Mutation pressure and natural selection are the major factors influencing codon usage bias in any organism (Rao et al., 2011; Sharp et al., 2010). However, the main factors affecting codon usage bias vary significantly among species. According to the parity rule 2 analysis, the GT content at the third position of a codon is higher than the AC content. However, A and T were used more frequently than G and C in the third position of the codons of Diospyros genes (Table S8), which suggested natural selection as one of the main reasons for Diospyros codon usage bias. Further ENC-plot analysis showed that the ENC value of most genes was close to the expected value (Figure 8a), suggesting that the codon usage bias of these genes was related to GC3, and mutation was the main influencing factor.
Additionally, a few genes in the plot (rpl16, rps18 and rps14) were well below the expected curve (Figure 8b bias (Shen et al., 2021). However, Sheng et al. (2021) reported natural selection as the main factor influencing codon usage bias of five different Miscanthus species. These results suggest that various pressures influence plastomes, and codon usage preferences of plastome genes vary among the dicotyledon taxa.

| Potential ptDNA barcodes and phylogeny of Diospyros
Taxonomic classification is challenging in Diospyros (Lee et al., 1996). Moreover, the worldwide distribution and phenotypic plasticity make it difficult to identify the wild Diospyros species (Ebenaceae; Lin et al., 2020). Generally, in such cases barcodes are used. However, only a limited number of DNA barcodes (e.g., rbcL, matK, and trnH-psbA) are available to resolve the phylogenetic relationships among the groups (Duangjai et al., 2009;Linan et al., 2019). Therefore, comparing more plastomes for developing variable DNA barcodes is important for Diospyros species.
Generally, the mutational hotspots have the potential to resolve taxonomic issues. They provide adequate genetic information for species identification and, therefore, can be used to develop novel DNA barcodes. The four potential mutational hotspots (ccsA-ndhD, trnT-trnL, rps16-psbK, petA-psbJ) identified in this study could be suitable barcodes for Diospyros classification. In addition, five other potential mutational hotspots (rpl33, rpl22, petL, rps15 and ycf1) were identified with high nucleotide polymorphisms in CDS.

| CONCLUSION
The present study analyzed the plastome sequences of 45 Diospyros species and performed phylogenetic analysis to provide valuable genetic information. The findings based on this analysis partially supported the previous classifications based on morphological features. In addition, the study offers new insights into the phylogenetic relationships between (sub)temperate and pantropical taxa.
Comparative plastome analysis revealed conserved genome structures and low nucleotide polymorphism. This study also identified mutational hotspots as phylogenetically informative markers that will contribute to future studies on Diospyros systematics and species identification. Moreover, this study assessed the adaptive evolution of the two taxa in Diospyros for the first time using dN/dS values, the ENC-plot, and the PR2 plot. This integrated analysis revealed natural selection and mutation pressure as the driving forces in the evolution of the plastomes of the 45 Diospyros species. However, because of insufficient sampling, the current study did not provide adequate genetic information for understanding the phylogenetic relationships and adaptive evolution of the entire Diospyros genus. Thus, future research should focus on more extensive sampling.

ACKNOWLEDGMENTS
The authors thank Dr. Zu-Lin Ning from Horticultural Center, South China National Botanical Garden in Guangzhou, for the help in the material collection.

FUNDING INFORMATION
This study was funded by the National Science Foundation of China (31800309), and the Zhejiang Provincial Public Welfare Technology and Application Research Project (LGN21C020007).

CONFLICT OF INTEREST STATEMENT
None declared.

DATA AVAILABILITY STATEMENT
The Diospyros plastomes generated in this study are available in the NCBI GenBank repository with accession numbers OP480008, OP480009 and OP485441. The supplementary material can be downloaded using the URL: https://datadryad.org/stash/share/JglK1pjqccr7Fylxl45HtlXMztzhrqoEaXFfbZktSCo.